An Overview of Resources and Basic Tools for the Processing of Serbian Written Texts
Abstract
In this paper we describe the resources and tools for the processing of texts written in Serbian. Most of the resources have been developed within the University of Belgrade NLP group located at the Faculty of Mathematics. The main features of these resources, namely the available monolingual and multilingual corpora and various e-dictionaries, are briefly described. The use of Intex, the main tool of the NLP group, for the recognition of unknown words, text tagging, building local grammars, and disambiguation is outlined.
Similar resources
Processing Serbian Written Texts: An Overview of Resources and Basic Tools
In this paper we describe the resources and tools for the processing of texts written in Serbian that have been developed within the University of Belgrade NLP group located at the Faculty of Mathematics. The main features of these resources, namely the available monolingual and multilingual corpora and various e-dictionaries, are briefly described. The use of Intex, the main tool of the NLP group, ...
Metadiscourse Markers Revisited in EFL Context: The Case of Iranian Academic Learners’ Perception of Written Texts
Moving in line with the postulation that metadiscourse (MD) markers help transform a dry and tortuous piece of text into a coherent and reader-friendly one, the researchers in the current study attempted to investigate the effect different metadiscourse markers might have on Iranian EFL learners’ perception of written texts. To this end, 120 undergraduate English students were given three diffe...
The Extent of Using the Basic Vocabulary in the First Grade Quran Textbook
S. B. Alavi Moghaddam, Ph.D. Textbooks need to be written in such a way that their readers can understand them. One way of ensuring this in first grade textbooks is to use basic vocabulary, as determined by Ne'matzadeh et al. (1384). Considering the importance of the Quran text...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Automatic Recognition of Composite Verb Forms in Serbian
In this paper, we will present the work on building a shallow parser for recognizing composite verb forms in Serbian – the forms that consist of an auxiliary verb and a main verb. The parser is made in Unitex, a corpus processing software, in the form of local grammars that rely on using morphological dictionaries of Serbian. The model was tested on a small corpus of texts, both written in Serb...
Publication date: 2003